Theoretical and Experimental Studies
of  Auditory  Processing 

Shihab  A.  Shamma  and  P.S.  Krishnaprasad 

This  proposal  describes  the  theoretical  and  experimental  research  into  the  principles  of 
sound  processing  in  the  auditory  system.  The  proposal  is  divided  into  two  parts.  The  first  is  a 
review of the work we accomplished in the last three years under the AFOSR grant
(F49620-92-J-0500). The second part outlines our proposed research plans for the next three years.

Part  I:  Review  of  Research  Results 

Summary 

The  research  reported  here  has  been  conducted  over  the  last  three  years  under  the  AFOSR 
grant (F49620-92-J-0500). It is divided into four general categories of projects: (1) VLSI
implementations of the early auditory stages; (2) functional organization of the auditory
cortex: neurophysiology; (3) functional models of the auditory system: psychoacoustics; and (4)
analysis of neural network architectures with wavelet transforms. We briefly review the
main results achieved in these four areas.

Besides the two P.I.s’ salaries, the grant supported several Ph.D. and M.S. students, a
laboratory manager, and, in part, a post-doctoral fellow (see the list of names and degrees at the
end of this review).

I.  VLSI  Design  and  Implementations  of  Early  Auditory  Processing 

Over  the  last  few  years,  we  have  been  developing  and  analyzing  detailed  models  of  the  early 
auditory  stages.  Our  goal  is  to  understand  the  underlying  signal  processing  principles  that 
endow such systems with their noise robustness and feature enhancement abilities. For
these models to become useful components in applications such as automatic speech recognition,
their computational cost has to be reduced drastically. One approach is to
implement the algorithms in VLSI using switched-capacitor filters (SCFs), which
provide several advantages [10, 11, 15, 16]:

•  Filter  pole  positions  are  determined  not  by  the  RC  products,  but  by  capacitor  ratios. 

•  Capacitor ratios can be precisely controlled and are stable with temperature. Furthermore,
accurate filter transfer functions can be implemented in a completely monolithic form.

•  SCFs require very little silicon area to implement high-value resistors.
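
The first and third advantages follow from the standard switched-capacitor approximation, valid for signal frequencies well below the clock rate: a capacitor C_s switched at f_clk behaves like a resistor R = 1/(f_clk C_s). The following minimal sketch illustrates this; the clock rate and capacitor values are hypothetical and are not taken from the fabricated chips.

```python
import math

def sc_equivalent_resistance(f_clk_hz, c_switched_f):
    """Equivalent resistance of a switched capacitor: R = 1 / (f_clk * C_s)."""
    return 1.0 / (f_clk_hz * c_switched_f)

def sc_pole_frequency(f_clk_hz, c_switched_f, c_integrating_f):
    """Pole of a first-order SC section: f_p = f_clk * (C_s / C_i) / (2*pi).

    Only the capacitor ratio and the clock frequency appear; the absolute
    capacitance values cancel.
    """
    return f_clk_hz * (c_switched_f / c_integrating_f) / (2.0 * math.pi)

f_clk = 100e3                                    # hypothetical 100 kHz clock
r_eq = sc_equivalent_resistance(f_clk, 1e-12)    # a 1 pF capacitor acts like 10 megohms
f_p_a = sc_pole_frequency(f_clk, 1e-12, 20e-12)  # ratio 1/20
f_p_b = sc_pole_frequency(f_clk, 2e-12, 40e-12)  # both capacitors doubled, same ratio

print(f"R_eq = {r_eq:.3g} ohm, pole A = {f_p_a:.1f} Hz, pole B = {f_p_b:.1f} Hz")
```

Doubling both capacitors leaves the pole unchanged, which is why the filter response tracks capacitor ratios rather than absolute component values.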

Over the last year, we have succeeded in developing such a system. Many obstacles were
overcome along the way, including the design of area-efficient integrators working at the relatively
low acoustic frequencies, and of offset- and parasitic-insensitive op-amps for the channel adders
(Fig. 1). Details of these and many other design innovations are available in the paper by Lin et
al. and in J. Lin’s Ph.D. thesis accompanying this review. VLSI chips with up to 32 channels per chip,
which can be combined in parallel to form a 64-channel system, have been fabricated through MOSIS.
A patent on the SCF design innovations was also granted (see Fig. 2 for details).

Figure 3: (a) Schematic of the transformation of the acoustic stimulus into an auditory spectrum
and then into a 2-dimensional pattern of activity in the auditory cortex. Responses of units
along the isofrequency planes are differentiated by their binaural preferences, in addition to
various monaural properties. (b) Three monaural response properties that are distinguishable
along the isofrequency planes: asymmetry, bandwidth, and FM directional selectivity of the
response fields.


II.  Functional  Organization  of  the  Auditory  Cortex:  Neurophysiology 

There  are  several  ongoing  projects  to  explore  the  functional  organization  of  the  primary  and 
anterior  auditory  cortex.  They  range  from  neurophysiological  mappings  of  the  responses  of  the 
various  areas  of  the  auditory  cortex,  to  a  detailed  comparison  of  responses  across  the  different 
fields,  to  the  exploration  of  the  linearity  of  cortical  responses  using  complex  broadband  stimuli. 
Results  from  these  projects  are  summarized  below. 

II.1 Neurophysiological mappings of the primary auditory cortex

The  primary  auditory  cortex  (AI)  is  essential  for  the  perception  and  localization  of  sound. 
Its  precise  role  in  carrying  out  these  functions,  however,  remains  a  mystery  despite  extensive 
knowledge  gained  from  ablation  experiments  and  from  single  and  multi-unit  recordings  with 
various  complex  stimuli.  Two  general  organizational  features  of  AI  have  been  previously  firmly 
established:  the  spatially  ordered  tonotopic  axis,  and  the  alternating  bands  of  binaural  response 
properties  that  run  perpendicularly  to  the  isofrequency  planes  (Fig. 3a).  These  axes  relate  to 
basic  simple  properties  of  the  acoustic  stimulus  that  are  already  established  at  much  lower 
levels  of  the  auditory  pathway.  With  the  exception  of  the  more  specialized  auditory  system 
of  the  bat,  ordered  responses  to  more  complex  stimulus  features,  analogous  to  the  orientation 
columns  and  direction  of  motion  selectivity  in  the  visual  cortex,  have  been  more  difficult  to 
find  in  AI.  At  present,  only  a  few  reports  hint  at  the  existence  of  such  maps  in  AI. 

This  issue  was  addressed  in  a  series  of  experiments  in  the  ferret  AI,  the  results  of  which 

Figure 4: (a) Distribution of the 2-tone response symmetry measure, M, in the ferret primary
auditory cortex from 2 animals (65 and 74). Circles: locations of the electrode penetrations
along the isofrequency contours ( - ). Asterisks: penetrations with weak auditory responses.
The arc in each map represents the location of the suprasylvian fissure; the dashed lines
delineate the approximate borders of the band within which the M measure changes from extreme
negative (o) to extreme positive values (•). A key for the classification scheme used is shown
on the left. The medial (M) and rostral (R) directions are indicated by the arrows; the arrow
lengths represent 0.5-mm distances on the surface of the cortex.
(b) Comparison of the topographic distribution of 2-tone and frequency modulation (FM)
responses in primary auditory cortex (AI). Map on the left: index M derived as in Fig. 11.
Map on the right: corresponding distribution of the C index computed from FM responses of the
same cells as in the left map. Map features and symbols are as in (a). Filled circles:
penetrations with selective responses to downward sweeps. Clear circles: penetrations with
selective responses to upward sweeps. Partially shaded penetrations are less selective.


were published in Shamma et al. (1992), which is appended to this review. The study explored the
detailed  organization  of  the  excitatory  and  inhibitory  responses  of  cortical  cells,  i.e.,  their 
so-called  receptive  fields.  The  aim  was  to  establish  whether  any  systematic  changes  in  the 
balance  of  inhibitory  and  excitatory  responses  occur  in  cells  along  the  isofrequency  planes  and, 
if  so,  to  determine  the  implications  of  these  changes  to  responses  to  frequency-modulated  (FM) 
tones and spectrally shaped noise stimuli. These response features are more complex than the
determination of a single best frequency (BF; tonotopicity) or the (binary) nature of a binaural
interaction (e.g., an Excitatory-Excitatory or Excitatory-Inhibitory response). The receptive
field of a cell represents, to first order, its transfer function, i.e., the way it filters or processes the
input spectrum. Similarly, FM tones reveal information about the dynamic interplay between
the inhibitory and excitatory responses of the cell.

The basic finding of the above experiments is the existence of a spatially ordered change in
the symmetry of the receptive fields in any given isofrequency plane in AI (Fig. 4). Considering
the results of experiments from over 20 animals, the outline of the distribution is as follows:
At  the  center  of  AI,  units  respond  with  a  narrow  excitatory  tuning  curve  at  BF,  flanked  by 
narrow  symmetric  inhibitory  side-bands.  The  receptive  fields  become  more  asymmetric  away 
from  the  center.  In  one  direction  (caudally  in  the  ferret  AI),  the  inhibitory  side-bands  above 
the  BF  become  relatively  stronger.  The  opposite  occurs  in  the  other  direction.  These  response 
types  tend  to  organize  along  one  or  more  bands  that  parallel  the  tonotopic  axis  (i.e.,  orthogonal 
to  the  isofrequency  planes). 

Many  more  response  properties  were  also  examined.  These  include  the  relation  between 
responses  to  spectrally  shaped  noise  and  the  symmetry  of  the  receptive  fields,  the  selectivity  of  a 
cell’s  response  to  the  direction  of  an  FM  tone  in  relation  to  its  receptive  field  symmetry  (Fig. 4), 
and  the  dependence  of  these  properties  on  stimulus  parameters  such  as  tone  intensities  and 

Figure 5: (a) Schematic of the three organizational axes of the response fields in AI.


inter-tone delays. Another important finding of these mappings was the columnar organization
of the responses: all units sampled in a given penetration were found to exhibit roughly
similar receptive field symmetry and FM selectivity.

A fundamental conjecture suggested by these results was that units along the symmetry
axis in fact encode, through their differential distribution along the isofrequency planes, a local
measure of the shape of the acoustic spectrum, specifically the locally averaged gradient of
the spectrum. This conjecture follows from the schematics of Fig. 3b, where best responses
to  spectral  peaks  or  edges  of  different  symmetries  are  mapped  systematically  across  the  AI. 
The  significance  of  such  a  map  stems  from  its  enhancement  and  explicit  representation  of 
such  perceptually  important  features  as  the  shape  of  spectral  peaks,  edges,  and  the  spectral 
envelope.  This  gradient  map  can  be  viewed  as  a  one  dimensional  analogue  of  the  orientation 
columns  of  the  visual  cortex,  since  the  orientation  of  a  two-dimensional  edge  simply  entails 
specifying  its  gradients  in  two  directions.  These  physiological  results  in  turn  suggested  a  series 
of  psychoacoustical  experiments  that  are  summarized  later  in  this  review. 

II.2  Comparison  of  the  responses  in  the  anterior  and  primary  auditory  fields 

The  characteristics  of  an  anterior  auditory  field  (AAF)  in  the  ferret  auditory  cortex  were 
described  in  terms  of  its  electrophysiological  responses  to  tonal  stimuli  and  compared  to  those 
of  primary  auditory  cortex  (AI).  The  AAF  is  located  dorsal  and  rostral  to  AI  on  the  ectosylvian 
gyrus  and  extends  into  the  suprasylvian  sulcus  rostral  to  AI.  The  tonotopicity  is  organized  with 
high frequencies at the top of the sulcus bordering the high-frequency area of AI, then reversing
with  lower  BFs  extending  down  into  the  sulcus.  AAF  contained  single  units  that  responded 
to a frequency range of 0.3 - 30 kHz. Best frequency (BF) range, rate-level functions at BF,
FM  directional  sensitivity,  and  variation  in  asymmetries  of  response  areas  were  all  comparable 
characteristics  between  AAF  and  AI.  Responses  in  both  areas  were  primarily  phasic.  The 
characteristics  that  were  different  between  the  two  cortical  areas  were:  latency  to  tone  onset, 
excitatory  bandwidth  20  dB  above  threshold  (BW20)  and  preferred  FM  rate,  as  parameterized 
with  the  centroid  (a  weighted  average  of  spike  counts).  The  mean  latency  of  AAF  units  was 
shorter  than  in  AI  (16.8  ms  AAF,  19.4  ms  AI).  BW20  measurements  in  AAF  were  typically 
twice  as  large  as  those  found  in  AI  (2.5  oct  AAF,  1.3  oct  AI).  The  AI  centroid  population  had 
a  significantly  larger  standard  deviation  than  the  AAF  centroid  population.  The  relationship 
between  centroid  and  BW20  was  also  examined  to  see  if  wider  bandwidths  were  a  factor  in  a 
unit’s ability to detect fast sweeps. There was a significant (P < 0.05) linear correlation in AAF
but  not  in  AI.  In  both  fields,  the  variance  of  the  centroid  population  decreased  with  increasing 
BW20.  BW20  decreased  as  BF  increased  for  units  in  both  auditory  fields. 

II.3  Characterization  of  AI  responses  using  broadband  spectral  ripples 

As reviewed earlier in Sec. II.1, response areas of cells along the isofrequency planes of
the mammalian primary auditory cortex (AI) are systematically organized with respect to
two properties: their excitatory bandwidths and their asymmetry (Fig. 5). To measure the
response areas, these investigations employed simple tones, which can be thought of as
impulse-like stimuli along the tonotopic axis. If cortical cells were to respond linearly, the measured
response  areas  would  reflect  the  “impulse  responses”  of  the  system  along  the  tonotopic  axis, 
and  hence  could  be  used  to  predict  the  system’s  responses  to  arbitrary  spectra.  Furthermore, 
by  Fourier  transforming  the  impulse  response,  one  would  obtain  the  corresponding  “transfer 
function”,  which  represents  the  system’s  response  to  sinusoidally  modulated  spectra  (Fig.  IB), 
more  commonly  known  in  the  psychoacoustical  literature  as  rippled  spectra.  Consequently, 
response  properties  measured  by  tonal  stimuli  might  be  equally  evident  from  their  ripple  transfer 
function. 
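
The linear-systems argument above can be illustrated numerically. The sketch below assumes a hypothetical response area (a Gabor function along the tonotopic axis, chosen only for illustration, not measured data), obtains its ripple transfer function with a discrete Fourier transform, and checks that the linearly predicted response to a single ripple equals the transfer function evaluated at that ripple frequency. The sampling density and all parameter values are assumptions.

```python
import numpy as np

# Tonotopic axis in octaves relative to the cell's BF (assumed sampling).
n, dx = 256, 1.0 / 32
x = (np.arange(n) - n // 2) * dx

# Hypothetical response area: excitatory centre flanked by inhibitory
# side-bands, modelled as a Gabor function.
omega_c = 1.0    # cycles/octave
sigma = 0.4      # octaves
response_area = np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * omega_c * x)

# Ripple transfer function: Fourier transform of the response area.
# ifftshift places x = 0 at index 0 so that an even response area
# yields a real-valued transform.
transfer = np.fft.rfft(np.fft.ifftshift(response_area)) * dx
ripple_freqs = np.fft.rfftfreq(n, d=dx)          # cycles/octave
omega_best = ripple_freqs[np.argmax(np.abs(transfer))]

# Linear prediction of the response to one ripple in cosine phase at BF.
ripple = np.cos(2 * np.pi * omega_c * x)
predicted = np.sum(response_area * ripple) * dx
k = int(np.argmin(np.abs(ripple_freqs - omega_c)))

print(omega_best, predicted, transfer[k].real)
```

Under the linearity assumption the two printed response values coincide, which is the sense in which tonal response areas and ripple transfer functions carry the same information.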

The suggestion that cortical cells are linear might appear farfetched given that their
rate-level functions often exhibit threshold, saturation, and nonmonotonic behavior. Nevertheless,
just as measuring with tones a cell’s bandwidth, tuning quality factor, or other linear systems
response  properties  is  considered  meaningful,  certain  characteristics  of  the  ripple  responses  may 
also  prove  useful,  or  possibly  related  to  the  properties  measured  with  tones.  It  is  possible  as 
well  that  nonlinearities  observed  with  tonal  stimuli  are  less  troublesome  with  broadband  rippled 
spectra,  or  negligible  over  a  certain  range  of  stimulus  parameters. 

An analogous situation to the above has long existed in experimental studies of
auditory-nerve responses. There, nonlinearities such as firing rate rectification, saturation, two-tone
suppression,  and  adaptation  are  prevalent  (see  review  in  Pickles  1986).  These  nonlinearities, 
however,  did  not  impede  measurements  of  transfer  characteristics  of  auditory-nerve  fibers  using 
single  tones,  noise  stimuli,  or  acoustic  clicks,  all  implying  strong  linear  components  in  the 
responses. 

Our primary goal in this work was to measure the responses of AI cells to rippled spectra
at  various  ripple  frequencies  and  phases,  i.e.,  to  measure  their  ripple  transfer  functions,  and 
the  dependence  of  this  function  on  the  amplitude  of  the  ripples  and  the  overall  intensity  of  the 
sound.  A  second  objective  was  to  compare  characteristic  features  of  these  transfer  functions  to 

Figure 6a: (A) Rippled complex stimulus composed of 101 tones equally spaced along the logarithmic
frequency axis between 1 and 20 kHz. The envelope is sinusoidally modulated on a logarithmic
frequency axis, either on a linear (left ordinate) or a logarithmic (right ordinate) amplitude scale.
The ripple phase is defined relative to the start of a ripple (i.e., the low-frequency edge). The ripple
stimuli were varied in ripple frequency Ω (0 - 4 cycle/octave in steps of 0.2 or 0.4) or ripple phase
(0 - 7π/4 in π/4 steps). The stimulus duration was 50 ms. (B) Dot display of responses of an AI
cell (BF = 7.5 kHz) to a ripple of a fixed Ω at various ripple phases. The stimuli were started at
100 ms. Spike counts are computed over a 50 ms window as indicated by the bold arrows, and are
displayed in the inset plot to the right. (C) Spike counts as a function of ripple phase for various
ripple frequencies Ω. The spike counts are indicated by the circles, and for each Ω the abscissa is
placed at the spike count averaged over all phases. The solid line is the best mean-square-error fit to
a sinusoid.

Figure 6b: (A-C) Examples of ripple responses from three cells. Left: magnitude transfer function
|T(Ω)|; middle: phase transfer function Φ(Ω); right: response field (inverse Fourier transform of
transfer function). Axes: Ω (cycle/oct), Ω (cycle/oct), frequency (kHz). Note that the width of
|T(Ω)| is about 1 octave. (D) Distribution of Ωo and phase Φo of single units in different
experiments in ferret AI.

Figure 6c: Cortical maps of Ωo (A) and Φo (B) for multiunit recordings in ferrets #158 and #155.
The scaling bar on the left indicates the values represented by the grey intensities. A Gaussian
weighted filling was applied to obtain the values between penetration locations (marked by circles).
Original values at the electrode locations were preserved. The map is made on the basis of responses
where |Φo| < 100°. The dashed lines connect local maxima of Ωo, and maxima and minima of Φo,
across isofrequency planes.


response  properties  measurable  using  tonal  stimuli,  such  as  the  bandwidth  or  the  asymmetry 
of  the  response  area. 
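
Panel (C) of the ripple-stimulus figure describes fitting a sinusoid to spike counts as a function of ripple phase. A minimal sketch of such a fit follows; it yields one magnitude and one phase value of the ripple transfer function per ripple frequency. The spike counts are synthetic, the eight phases in π/4 steps are an assumed sampling, and the sign convention for the fitted phase is an assumption rather than the convention of the manuscripts.

```python
import numpy as np

def fit_ripple_response(phases_rad, spike_counts):
    """Least-squares fit of counts ~ mean + a*cos(phase) + b*sin(phase).

    Returns (mean, magnitude, phase0) such that
    counts ~ mean + magnitude * cos(phase - phase0).
    """
    phases = np.asarray(phases_rad, dtype=float)
    counts = np.asarray(spike_counts, dtype=float)
    design = np.column_stack([np.ones_like(phases), np.cos(phases), np.sin(phases)])
    coef, *_ = np.linalg.lstsq(design, counts, rcond=None)
    mean, a, b = coef
    return mean, np.hypot(a, b), np.arctan2(b, a)

# Synthetic spike counts at eight ripple phases (hypothetical data).
phases = np.arange(8) * np.pi / 4
counts = 10.0 + 6.0 * np.cos(phases - np.pi / 3)

mean, magnitude, phase0 = fit_ripple_response(phases, counts)
print(mean, magnitude, phase0)
```

Repeating the fit at each ripple frequency gives the magnitude and phase curves from which the characteristic ripple frequency and phase can be read off.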

Such  an  approach  has  proven  fruitful  in  analogous  studies  of  the  primary  visual  cortex  (De 
Valois  and  De  Valois  1988).  There,  transfer  functions  measured  using  sinusoidally  modulated 
gratings  reveal  much  about  the  functional  organization  of  the  system,  and  its  response  to  more 
complex  stimuli  such  as  oriented  bars.  In  auditory  physiology,  such  stimuli  have  only  been 
reported  by  Calhoun  and  Schreiner  (1993).  Recently,  several  psychoacoustical  studies  (Hillier 
1991;  Summers  and  Leek  1994;  Vranic-Sowers  and  Shamma  1994a,  1994b)  have  converged  on 
the  similar  notion  that  measuring  the  perceptual  thresholds  of  rippled  spectra  may  help  explain 
how  spectral  profiles  are  perceived. 

The results we obtained in extensive recordings of single units in the primary auditory
cortex of the ferret can be summarized as follows. Using broadband stimuli (1-20 kHz) with
sinusoidally modulated spectral envelopes (ripples), the response magnitude of each cell was
measured as a function of ripple frequency (Ω) and ripple phase (Φ), from which a “ripple
transfer function” was constructed (Fig. 6a-b). Most cells (approximately 90%) responded best
around a specific (characteristic) ripple frequency, Ωo. Values of Ωo range from 0.2 to 3
cycles/octave, with the average of the distribution around 1.0 (Fig. 6b). Most cells also exhibited
a linear ripple phase as a function of Ω. The intercept of the phase function is interpreted
as the best (characteristic) ripple phase to drive the cell, Φo; the slope of the line reflects the
location of the response area of the cell along the tonotopic axis. Φo ranges over the full cycle
in a Gaussian-like distribution around 0° (Fig. 6b). By inverse Fourier transforming the transfer
function, a “response field” (RF) of the cell was obtained, an analogue of the response area
measured with tonal stimuli. Parameters of the RF were compared to parameters of the tonal
response area. The BF of the RF, BFrf, was very similar to the tonal BF, and Ωo and Φo
were weakly but significantly correlated to the excitatory bandwidth and asymmetry index of
the tonal response area, respectively. The RF was found to be a stable measure of a unit’s
response regardless of ripple amplitudes or overall stimulus levels. Responses to rippled spectra
in AI closely resemble the responses to sinusoidal gratings in the primary visual cortex
(V1). This provides a unified framework within which to interpret the functional organization
of both cortices. Details of this work are found in the manuscript entitled “Ripple Analysis in
the Primary Auditory Cortex (Part I)”, copies of which are appended to this proposal.
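
The link between the characteristic ripple phase and the asymmetry of the response field can be sketched as follows. The example assumes a hypothetical transfer function with a Gaussian magnitude centred on the characteristic ripple frequency and a constant phase (that is, a cell centred at the origin of the tonotopic axis), and inverse Fourier transforms it. The function name, the Gaussian shape, and all parameter values are illustrative assumptions.

```python
import numpy as np

def response_field(omega0, phi0, width=0.4, n=256, dx=1.0 / 32):
    """Inverse-transform a hypothetical ripple transfer function.

    omega0: characteristic ripple frequency (cycles/octave)
    phi0:   characteristic ripple phase (radians)
    Returns (x, rf) with x in octaves relative to the centre of the field.
    """
    omegas = np.fft.rfftfreq(n, d=dx)
    magnitude = np.exp(-(omegas - omega0) ** 2 / (2 * width ** 2))
    transfer = magnitude * np.exp(1j * phi0)
    rf = np.fft.fftshift(np.fft.irfft(transfer, n=n)) / dx
    x = (np.arange(n) - n // 2) * dx
    return x, rf

x, rf_symmetric = response_field(omega0=1.0, phi0=0.0)
_, rf_skewed = response_field(omega0=1.0, phi0=np.pi / 2)

centre = len(x) // 2
print(rf_symmetric[centre], rf_skewed[centre])
```

With a phase of 0 the response field is even about its centre (excitation flanked by symmetric inhibition); with a phase of π/2 it is odd, with inhibition dominating on one side. Intermediate phases give intermediate skews.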

We have also examined the topographic distribution of response parameters using the ripple
and tonal stimuli in the primary auditory cortex (AI) (Fig. 6c). Both single-unit and multiunit
recordings were used in these studies. As before, for each unit or cluster, responses to ripples
were parametrized in terms of the characteristic ripple Ωo and phase Φo (i.e., the best
ripple frequency and phase, respectively). Two corresponding response area parameters (using
tonal stimuli) were also measured: the excitatory bandwidth at 20 dB above threshold
(BW20), which is roughly inversely proportional to Ωo, and the asymmetry as reflected by the
directional sensitivity index (C) to frequency-modulated (FM) tones (which is proportional to
Φo). The response parameters measured from multiunit records corresponded well to those
obtained from single units in the same cluster. The topographic distribution of the response
parameters across the surface of AI was studied with multiunit recordings in four animals. In
most maps, systematic patterns or clustering of response parameters could be discerned along
the isofrequency planes. The distribution of the characteristic ripple Ωo exhibited two trends.
First, along the isofrequency planes, the largest values were grouped in one or two clusters
near the middle of AI, with smaller values found towards the edges. The second trend
occurred along the tonotopic axis, where the maximum Ωo found in an isofrequency range increases
with increasing BF. The distribution of the characteristic ripple phase, Φo, which reflects the
asymmetry in the response field, also showed a clustering along the isofrequency axis. At the
center of AI, symmetric responses (Φo ≈ 0) predominated. Towards the edges, the RFs became
more asymmetric, with Φo < 0 caudally and Φo > 0 rostrally. The asymmetric response types
tended to cluster along repeated bands that paralleled the tonotopic axis. The distribution of
the response area measures BW20 and C-index exhibited similar trends along the isofrequency
planes as Ωo and Φo.

Details of this work are found in the manuscript entitled “Ripple Analysis in the Primary
Auditory  Cortex  (Part  II)”,  copies  of  which  are  appended  to  this  proposal. 

III.  Functional  Models  of  the  Auditory  System:  Psychoacoustics 

The experimental results described above suggested that specific features of the shape of
the acoustic spectrum are being extracted and mapped in the cortex. If so, there are likely to be
important consequences for the perception of such spectra. Very little direct
evaluation exists of such features as the sensitivity of subjects to the symmetry of spectral peaks
and to local gradients. We have therefore developed experimental set-ups and paradigms, with the help
of Dr. David Green, to carry out such experiments. These are described in detail in the two
manuscripts accompanying this proposal entitled “Representation of Spectral Profiles in the
Auditory System: Parts I and II”.

Over  the  last  year,  we  have  finished  a  series  of  experiments  on  the  perception  of  spectral 

Figure 7a: (A) Impulse responses of three filters with characteristic ripple frequencies Ωo = 1, 3, and
8 cycle/octave and characteristic phase Φo = 0. Filters are centered at 0 octave. The impulse
response is computed for σ(Ωo) = 0.3. (B) Fourier transform of the three impulse responses of
the filters in (A) plotted on a logarithmic Ω axis.

Figure 7b: Profile detection tests in the ripple analysis model. (A) The dashed line is a polynomial
approximation to the perceptual thresholds measured with single ripples (reproduced from Fig. 3.27
in Hillier, 1991). Each data point (denoted by circles) represents the peak amplitude of r(Ωo) due
to a just-detectable ripple with frequency Ω. The solid lines represent r(Ωo) computed for the
profiles shown in (B). (B) The alternating, step, and single component increment profiles at their
just-detectable amplitudes according to Bernstein and Green (1987).

peak shapes, specifically their symmetry and bandwidth. The results we obtained could be
explained elegantly by a model of auditory spectral profile perception that utilizes spectral
ripples as the elementary representational features. In particular, the model assumes that
the spectral profile is represented in the auditory system by a weighted sum of sinusoidally
modulated spectra (ripples). The analysis is performed by a bank of bandpass filters, each
tuned to a particular ripple frequency and ripple phase (Fig. 7a). The parameters of the model
are estimated using data from single ripple detection experiments. The model is then used to
account for detection thresholds of more complex profiles such as the step, single component
increment, and alternating profiles (Fig. 7b). Physiological and psychophysical evidence
from the auditory and visual systems in support of this type of model is reviewed in detail
in the accompanying manuscripts.

Based on the ripple analysis model, detection thresholds of shape changes in spectral peak
profiles were then interpreted. Peak shape is uniquely described in terms of two parameters:
a bandwidth factor (BWF), which reflects the sharpness of a peak, and a symmetry factor (SF),
which roughly measures the local evenness or oddness of a peak. Using profile analysis methods,
thresholds to changes in these parameters (defined as δBWF/BWF and δSF) are measured
together with the effects of several manipulations such as using different peak levels, varying
spectral component densities, and randomizing the frequencies of the peaks. The ripple analysis
model could account well for the measured thresholds (Fig. 8). Predictions of three previously
published models for the same profiles were also evaluated and discussed.
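
One property that makes the ripple representation convenient for bandwidth changes is that dilating a peak along the logarithmic frequency axis shifts its ripple transform along a logarithmic ripple-frequency axis without changing its shape. The sketch below checks this numerically for Gaussian peaks. The Gaussian shape, the use of its standard deviation as a stand-in for BWF, and the half-magnitude criterion are all assumptions made for illustration.

```python
import numpy as np

def half_magnitude_ripple(width_oct, n=4096, dx=1.0 / 256):
    """Ripple frequency (cycles/octave) at which the normalised ripple
    transform magnitude of a Gaussian peak first falls to one half."""
    x = (np.arange(n) - n // 2) * dx
    profile = np.exp(-x**2 / (2 * width_oct**2))
    magnitude = np.abs(np.fft.rfft(profile))
    magnitude /= magnitude[0]
    omegas = np.fft.rfftfreq(n, d=dx)
    k = int(np.argmax(magnitude < 0.5))          # first bin below one half
    # Linear interpolation between the two bins that bracket the crossing.
    frac = (magnitude[k - 1] - 0.5) / (magnitude[k - 1] - magnitude[k])
    return omegas[k - 1] + frac * (omegas[k] - omegas[k - 1])

narrow = half_magnitude_ripple(0.1)   # peak width 0.1 octave
wide = half_magnitude_ripple(0.4)     # fourfold wider peak
shift_octaves = np.log2(wide / narrow)
print(narrow, wide, shift_octaves)
```

A fourfold increase in width moves the transform down by about two octaves of ripple frequency, consistent with a shift of log2(0.25) = -2.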

Figure 8a: (A) Profiles of two symmetric peaks (described in detail in Vranic-Sowers and Shamma
1994a). The BWF measures the 3 dB bandwidth of a peak. (B) The ripple transform magnitudes,
r(Ωo), of the peaks in (A). The dashed lines denote the locations of the steepest low pass edges in
r(Ωo). The effect of a BWF change is a shift (and not a change in shape) of the ripple transform
along the logarithmic Ωo axis. For example, a fourfold increase in BWF (from 0.1 to 0.4 octaves),
i.e., δBWF/BWF = 3 or α = 0.25, results in a (Δ = log2 α =) 2 octave downward shift in r(Ωo).

Figure 8b: (A) Perceptual phase difference at threshold (reproduced from Vranic-Sowers and
Shamma, 1994b) for various ripple frequencies Ω, for 161 frequency components and at 25 dB
peak-to-valley amplitude. The task is to detect a change in the phase of a ripple while keeping the ripple
frequency constant. (B) Shifting the ripple transform phase of the peak profile by a
constant angle simply changes the symmetry of the profile. Solid line is a symmetric peak (BWF
= 0.4 octave) with ripple transform phase = 0. Dashed lines are the skewed peaks resulting from
adding three different angles to the ripple transform phase (Vranic-Sowers and Shamma, 1994a).

IV.  Mathematical  Models  of  the  Auditory  Cortex 

With  these  experimental  data  in  hand,  we  then  developed  mathematical  models  of  the 
receptive  fields  and  analyzed  the  nature  of  the  responses  and  potential  features  encoded  by 
the  cortex.  The  model  suggests  that  the  auditory  system  analyzes  an  input  spectral  pattern 
along  three  independent  dimensions;  a  logarithmic  frequency  axis,  a  local  symmetry  axis,  and 
a  local  spectral  bandwidth  axis  (Fig.5).  It  is  shown  that  this  representation  is  equivalent 
to  performing  an  afl&ne  wavelet  transform  of  the  spectral  pattern  and  preserving  both  the 
magnitude  (a  measure  of  the  scale  or  local  bandwidth  of  the  spectrum)  and  phase  (a  measure 
of  the  local  symmetry  of  the  spectrum).  Such  an  analysis  is  in  the  spirit  of  the  cepstral  analysis 
commonly  used  in  speech  recognition  systems,  the  major  difference  being  that  the  double 
Fourier-like  transformation  that  the  auditory  system  employs  is  carried  out  in  a  local  fashion. 
Examples  of  such  a  representation  for  various  speech  and  synthetic  signals  are  discussed  in  the 
accompanying paper, which has been accepted for publication in the IEEE Transactions on Speech
and Audio Processing.
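
The magnitude/phase reading of such a local transform can be illustrated with a minimal sketch.
It assumes a complex Gabor-type analysis kernel on the log-frequency axis, which is only
approximately zero-mean and is our illustrative choice here, not the cortical receptive-field
model itself. The function name, the kernel, and all parameter values are hypothetical. The
sketch computes one complex coefficient centered on a spectral peak: a symmetric peak should
give a phase near zero, while a skewed peak should give a clearly nonzero phase.

```python
import numpy as np

def wavelet_coefficient(profile, x, center, scale, cycles=0.5):
    """Complex coefficient of `profile` at one position and scale.

    x      : log-frequency axis in octaves (uniformly sampled)
    center : position of the analysis kernel on that axis
    scale  : kernel width in octaves (plays the role of local bandwidth)
    cycles : carrier cycles per unit of scaled axis (illustrative value)
    """
    u = (x - center) / scale
    # Gabor-type kernel: Gaussian envelope times complex carrier (an assumption).
    psi = np.exp(-0.5 * u ** 2) * np.exp(2j * np.pi * cycles * u)
    dx = x[1] - x[0]
    return np.sum(profile * np.conj(psi)) * dx / np.sqrt(scale)

# Two synthetic spectral peaks on a 4-octave axis.
x = np.linspace(-2.0, 2.0, 401)
symmetric = np.exp(-0.5 * (x / 0.2) ** 2)
skewed = np.where(x < 0,
                  np.exp(-0.5 * (x / 0.1) ** 2),   # steep lower edge
                  np.exp(-0.5 * (x / 0.3) ** 2))   # shallow upper edge

c_sym = wavelet_coefficient(symmetric, x, center=0.0, scale=0.3)
c_skew = wavelet_coefficient(skewed, x, center=0.0, scale=0.3)

mag_sym, phase_sym = np.abs(c_sym), np.angle(c_sym)
mag_skew, phase_skew = np.abs(c_skew), np.angle(c_skew)
print("symmetric peak: magnitude %.3f, phase %.3f rad" % (mag_sym, phase_sym))
print("skewed peak:    magnitude %.3f, phase %.3f rad" % (mag_skew, phase_skew))
```

Repeating the computation over a range of scales and centers gives the full three-dimensional
picture: position along the frequency axis, the scale at which the magnitude is largest, and
the phase at that point.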

V.  Analysis  of  Neural  Network  Architectures  with  Wavelet  Transforms 

The  goal  of  this  work  was  to  achieve  a  coherent  theoretical  foundation  for  a  class  of  neural 
network architectures called feedforward networks. In the work of Y.C. Pati and P.S.
Krishnaprasad, it was shown that it is possible to structure feedforward networks using the theory
of discrete affine wavelet transforms. In particular, it was shown that for $L^2(\mathbb{R})$, it is
possible to construct frames out of sigmoids, and hence represent elements of $L^2(\mathbb{R})$ as
feedforward networks with one hidden layer. The dilations and shifts that appear in such a
representation are determined from prior knowledge of spatio-spectral concentration of the given
function/map. A
major  advantage  of  this  structuring  is  that  the  coefficients  to  be  fitted  from  data,  being  simply 
the  weights  from  the  hidden  layer  to  the  output  layer,  enter  linearly  in  the  model,  thereby 
leading  to  a  convex/quadratic  optimization  problem.  This  result,  being  among  the  very  first 
to  place  feedforward  networks  in  the  rigorous  context  of  wavelet  theory,  inspired  us  to  further 
investigate  the  use  of  wavelet  transforms. 
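
The linear-in-the-weights structure can be made concrete with a minimal sketch. The mother
function below is an illustrative zero-mean combination of four sigmoids and is not necessarily
the construction used in the cited work. The dilation set, shift step, and target function are
likewise hypothetical. Once dilations and shifts are fixed, only the output weights remain, and
fitting them is an ordinary least-squares problem.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bump(x):
    # Localized bump built from two sigmoids.
    return sigmoid(x + 1.0) - sigmoid(x - 1.0)

def mother(x):
    # Odd, zero-mean combination of four sigmoids (illustrative choice).
    return bump(x + 2.0) - bump(x - 2.0)

def design_matrix(x, scales, shift_step=1.0):
    """Columns are sqrt(a) * mother(a * x - n * b) for fixed dilations a and shifts n * b."""
    lo, hi = x.min(), x.max()
    columns = []
    for a in scales:
        n_min = int(np.floor(a * lo / shift_step))
        n_max = int(np.ceil(a * hi / shift_step))
        for n in range(n_min, n_max + 1):
            columns.append(np.sqrt(a) * mother(a * x - n * shift_step))
    return np.column_stack(columns)

# Hypothetical target map sampled on an interval.
x = np.linspace(-4.0, 4.0, 200)
y = np.sin(2.0 * x) * np.exp(-0.5 * x ** 2)

# Hidden layer is fixed by the chosen dilations and shifts.
Phi = design_matrix(x, scales=[0.5, 1.0, 2.0])

# Only the hidden-to-output weights are fitted: a linear least-squares problem.
weights, *_ = np.linalg.lstsq(Phi, y, rcond=None)
rel_error = np.linalg.norm(Phi @ weights - y) / np.linalg.norm(y)
print("hidden units (wavelets):", Phi.shape[1])
print("relative fit error:", rel_error)
```

Because the objective is quadratic in the weights, there are no local minima to contend with,
in contrast to training the hidden-layer parameters by gradient descent.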

One of the difficulties in extending the above result to the multidimensional setting of
$L^2(\mathbb{R}^n)$ is that a naive approach based on tensor products of scalar frames can lead to
computation-intensive formulations. Further, it was not quite clear how to work out the frame
theory in this context; in particular, certain key results of Daubechies needed to be extended. In the
forthcoming  M.S.  thesis  of  T.  Kugarajah,  these  problems  are  solved,  and  furthermore,  the 
results  have  been  applied  to  the  problem  of  adaptive  control  in  nonlinear  systems.  This  goes 
quite  a  bit  beyond  the  current  literature  based  on  the  ad  hoc  use  of  radial  basis  functions  to 
model  nonlinear  vector  fields. 

In  continuation  of  our  effort  on  feedforward  networks,  we  started  looking  into  the  possibility 
of  similar  techniques  for  recurrent  networks  (in  continuous  time).  More  precisely,  we  started  a 
careful study of the problem of vector field approximations in nonlinear dynamics via basis
vector  fields.  Prior  efforts  ignore  the  fundamental  geometric  differences  between  vector  field 
approximations  and  function  approximations  (vector  fields  do  not  transform  under  coordinate 
change  in  the  same  way  as  scalar  fields).  We  note  for  instance  the  use  of  radial  basis  functions  in 
adaptive  control.  Our  effort,  based  on  an  understanding  of  geometric  approximation  techniques 
for  vector  fields  by  polynomial  vector  fields,  and  vector  wavelets  (a  subject  that  is  undergoing 
rapid  development  due  to  the  interests  of  researchers  in  fluid  mechanics  and  quantum  field 
theory)  is  likely  to  yield  new  insights  into  the  structure  of  recurrent  networks  and  more  generally 
locally  interacting  dynamical  systems.  A  graduate  research  assistant  supported  by  AFOSR 
(Herbert  Strumper)  is  involved  in  this  study,  as  a  part  of  his  Ph.D.  research. 

To enhance the scope of our research in wavelet bases, we considered the problem of
approximation of infinite-dimensional linear dynamical systems by finite-dimensional linear systems.
This fundamental problem, arising in many fields of applications of control theory ranging from
control of vibrating structures, to control of heat flow in furnaces, to cancellation of noise in
real time by destructive interference, is shown to have an elegant solution via rational wavelets
in the Hardy space $H^2(\Pi^+)$. The construction of such bases has opened the way for a wide range
of applications, e.g., in the use of smart materials to carry out fast identification of structural
dynamics.

A direct consequence of the work on rational wavelets was the discovery of new ways
to  organize  the  design  problem  for  switched  capacitor  filters  for  the  auditory  studies  (Fig.9a). 
This  led  to  useful  collaboration  between  Daniel  Lin  (Ph.D.  student  of  Shamma)  and  Y.C.  Pati 
(Ph.D.  student  of  Krishnaprasad).  Further  applications  of  rational  wavelets  are  under  way. 

The work on rational wavelets for identification would have remained a theoretical curiosity
if  it  were  not  possible  to  do  fast  computation  of  such  rational  wavelet  models.  In  recent  work, 
new  recursive  algorithms  have  been  devised  for  systematic  approximation  via  basis  function 
representations. The new algorithms, known as orthogonal matching pursuit algorithms, are
applicable  to  a  wide  class  of  problems,  ranging  from  fitting  radial  basis  function  approximations 
to  wavelet-based  models  for  transfer  functions  of  linear  systems.  These  new  algorithms  are 
well-equipped to work with raw data as well as data subject to preliminary processing. It is of
further  interest  that  these  algorithms  are  well-suited  to  the  exploitation  of  certain  forms  of  a 
priori  knowledge  (in  the  time-frequency  plane).  Several  papers  have  resulted  from  this  work, 
and  software  packages  are  now  available  for  the  use  of  such  algorithms  on  Sun  workstations. 
The  packages  include  a  commercial  MATLAB  based  toolbox  and  a  free  package  developed  at 
Stanford  and  Maryland.  These  come  with  effective  graphical  user  interfaces.  One  M.S.  student 
has  put  the  techniques  and  software  to  good  use  in  the  identification  of  the  dynamics  of  flexible 
beams with surface-mounted piezoelectric sensors and actuators (so-called smart structures;
Fig.9b). The algorithms are fast enough to merit consideration in real-time applications.
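
The greedy selection and re-projection steps of orthogonal matching pursuit can be sketched as
follows. This is a minimal batch version that re-solves a least-squares problem at every step,
whereas a recursive implementation would update the projection incrementally. The function name
and the cosine dictionary used in the example are illustrative assumptions, not the software
packages mentioned above.

```python
import numpy as np

def orthogonal_matching_pursuit(D, y, n_atoms, tol=1e-10):
    """Approximate y with at most n_atoms columns (atoms) of D.

    D is assumed to have unit-norm columns. Returns the coefficient
    vector and the list of selected atom indices in selection order.
    """
    residual = np.asarray(y, dtype=float).copy()
    selected = []
    coeffs = np.zeros(0)
    for _ in range(n_atoms):
        if np.linalg.norm(residual) <= tol:
            break
        # Pick the atom most correlated with the current residual.
        correlations = np.abs(D.T @ residual)
        if selected:
            correlations[selected] = 0.0   # never reselect an atom
        selected.append(int(np.argmax(correlations)))
        # Re-fit all selected coefficients: this keeps the residual
        # orthogonal to every atom chosen so far.
        coeffs, *_ = np.linalg.lstsq(D[:, selected], y, rcond=None)
        residual = y - D[:, selected] @ coeffs
    x = np.zeros(D.shape[1])
    x[selected] = coeffs
    return x, selected

# Example dictionary: unit-norm cosine atoms (an orthogonal DCT-II family).
n = 32
t = np.arange(n)
D = np.cos(np.pi * np.outer(t + 0.5, np.arange(n)) / n)
D /= np.linalg.norm(D, axis=0)

# Signal built from two atoms; two pursuit steps recover them.
y = 3.0 * D[:, 2] - 2.0 * D[:, 7]
x_hat, support = orthogonal_matching_pursuit(D, y, n_atoms=2)
print("selected atoms:", support)
```

The same routine accepts overcomplete dictionaries (more atoms than samples), which is the
setting of practical interest. The orthogonal dictionary is used here only so that the result
is easy to check by hand.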

The recent discoveries in thalamo-cortical oscillations have led to the suggestion that
oscillatory neural networks are playing an important part in the solution to the so-called "dynamic
binding problem", where coherent oscillations encode the binding together of features of an
image  in  a  receptor  field.  Somewhat  influenced  by  these  exciting  developments,  we  undertook  a 
deep  study  of  the  properties  of  networks  of  oscillatory  neurons  (sometimes  called  rotor  neurons). 
We  have  a  better  understanding  of  the  mean-field  theory  of  a  class  of  such  networks.  We  have 
proved  new  convergence  theorems  using  arguments  based  on  LaSalle’s  invariance  principle.  We 
have  further  insights  into  asymptotic  behaviors.  New  implementations  in  analog  networks  are 
being  considered.  One  M.S.  student  (Eric  Justh)  is  involved  in  this  project.  He  is  however  an 
NSF Graduate Fellow and hence does not need support from AFOSR. His thesis has just been
completed, and a preprint based on this work is available and is being readied for submission
to  a  journal. 
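
The kind of convergence argument involved can be illustrated on a simple phase model. The
sketch below assumes a network of rotors with symmetric coupling W, obeying
dθ_i/dt = Σ_j W_ij sin(θ_j − θ_i). This is a generic model chosen for illustration and is not
necessarily the class of networks analyzed in the thesis. For symmetric W the dynamics are a
gradient flow of the energy E(θ) = −(1/2) Σ_ij W_ij cos(θ_i − θ_j), so E cannot increase along
trajectories, which is the type of function to which LaSalle's invariance principle applies.
The coupling matrix, initial phases, and step size are hypothetical.

```python
import numpy as np

def phase_velocity(theta, W):
    """dtheta_i/dt = sum_j W[i, j] * sin(theta_j - theta_i)."""
    diff = theta[None, :] - theta[:, None]     # diff[i, j] = theta_j - theta_i
    return np.sum(W * np.sin(diff), axis=1)

def energy(theta, W):
    """E = -0.5 * sum_ij W[i, j] * cos(theta_i - theta_j), for symmetric W."""
    diff = theta[:, None] - theta[None, :]
    return -0.5 * np.sum(W * np.cos(diff))

def simulate(theta0, W, dt=0.01, steps=2000):
    """Forward-Euler integration; returns final phases and the energy history."""
    theta = np.array(theta0, dtype=float)
    energies = [energy(theta, W)]
    for _ in range(steps):
        theta = theta + dt * phase_velocity(theta, W)
        energies.append(energy(theta, W))
    return theta, np.array(energies)

# Five rotors with uniform all-to-all coupling (symmetric, zero diagonal).
W = np.ones((5, 5)) - np.eye(5)
theta0 = [0.0, 0.5, 1.0, 2.0, 3.0]

theta_final, energies = simulate(theta0, W)
print("initial energy:", energies[0])
print("final energy:  ", energies[-1])
print("final phase velocities:", phase_velocity(theta_final, W))
```

The small step size keeps the discrete iteration a descent method for E, so the monotone
decrease of the continuous-time flow carries over to the simulated energy history.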

In  addition  to  the  papers  listed  below,  several  papers  are  under  preparation  including  one 
based  on  a  presentation  at  an  April  1994  Symposium  sponsored  by  the  National  Academy 

Example I: Approximation of Cochlear Filter Response. Normalized $H^2(\Pi^+)$ Approximation Error Versus Model-Order.

Example II: Identification of Flexible Beam (Rezaiifar 1993). Experimental Flexible Beam Setup.